49 research outputs found

    Low Power Elliptic Curve Cryptography

    Get PDF
    This M.S. thesis introduces new modulus scaling techniques for transforming a class of primes into special forms which enable efficient arithmetic. The scaling technique may be used to improve multiplication and inversion in finite fields. We present an efficient inversion algorithm that utilizes the structure of a scaled modulus. Our inversion algorithm exhibits superior performance to the Euclidean algorithm and lends itself to efficient hardware implementation due to its simplicity. Using the scaled modulus technique and our specialized inversion algorithm we develop an elliptic curve processor architecture. The resulting architecture successfully utilizes redundant representation of elements in GF(p) and provides a low-power, high speed, and small footprint specialized elliptic curve implementation. We also introduce a unified Montgomery multiplier architecture working on the extension fields GF(p), GF(2) and GF(3). With the increasing research activity for identity based encryption schemes, there has been an increasing need for arithmetic operations in field GF(3). Since we based our research on low-power and small footprint applications, we designed a unified architecture rather than having a seperate hardware for GF{3}. To the best of our knowledge, this is the first time a unified architecture was built working on three different extension fields

    A versatile Montgomery multiplier architecture with characteristic three support

    Get PDF
    We present a novel unified core design which is extended to realize Montgomery multiplication in the fields GF(2n), GF(3m), and GF(p). Our unified design supports RSA and elliptic curve schemes, as well as the identity-based encryption which requires a pairing computation on an elliptic curve. The architecture is pipelined and is highly scalable. The unified core utilizes the redundant signed digit representation to reduce the critical path delay. While the carry-save representation used in classical unified architectures is only good for addition and multiplication operations, the redundant signed digit representation also facilitates efficient computation of comparison and subtraction operations besides addition and multiplication. Thus, there is no need for a transformation between the redundant and the non-redundant representations of field elements, which would be required in the classical unified architectures to realize the subtraction and comparison operations. We also quantify the benefits of the unified architectures in terms of area and critical path delay. We provide detailed implementation results. The metric shows that the new unified architecture provides an improvement over a hypothetical non-unified architecture of at least 24.88%, while the improvement over a classical unified architecture is at least 32.07%

    Optimization of 5-axis milling processes using process models

    Get PDF
    Productivity and part quality are extremely important for all machining operations, but particularly for 5-axis milling where the machine tool cost is relatively higher, and most parts have complex geometries and high quality requirements with tight tolerances. 5- axis milling, presents additional challenges in modeling due to more complex tool and workpiece interface geometry, and process mechanics. In this paper, modeling and optimization of 5-axis processes with cutting strategy selection are presented. The developed process models are used for cutting force predictions using a part-tool interface identification method which is also presented. Based on the model predictions and simulations, best cutting conditions are identified. Also, for finish process of a complex surface, machining time is estimated using three machining strategy alternatives. Results are demonstrated by example applications, and verified by experiments

    CFD Modelling of Two Different Cold Stores Ambient Factors

    Get PDF
    AbstractObjective of the research was to determine ambient temperature and relative humidity distributions of two different cold stores which have two different cooling systems. One of the cold store which is called as Cold store-I, has classical cooling system such as compressor, condenser and evaporator. Second called Cold store-II, has air conditioning system for cooling, cold air ventilation and aspiration systems, and humidification system. Computational fluid dynamics was used for modelling of distribution of temperature and relative humidity of cold store walls. Storage temperature and relative humidity were assumed 2°C and 90%, respectively. Boundary conditions were set as; Inlet-Surface of fluid inlet, Outlet-Surface of fluid outlet, and walls-solid, proof against flow of fluid. A tetrahedral mesh was created by using ANSYS 14.0 and calculation finished when accessing a solution. Turbulence was modelled using the k-ɛ (k-epsilon). Spatial distribution in two cold stores for two different cooling systems were modelled and evaluated in this research. Data determined from CFD models were compared for both cold stores. Cold store-II was better than Cold store-I because it has air distribution holes located on ceiling

    Design and implementation of a fast and scalable NTT-based polynomial multiplier architecture

    Get PDF
    In this paper, we present an optimized FPGA implementation of a novel, fast and highly parallelized NTT-based polynomial multiplier architecture, which proves to be effective as an accelerator for lattice-based homomorphic cryptographic schemes. As I/O operations are as time-consuming as NTT operations during homomorphic computations in a host processor/accelerator setting, instead of achieving the fastest NTT implementation possible on the target FPGA, we focus on a balanced time performance between the NTT and I/O operations. Even with this goal, we achieved the fastest NTT implementation in literature, to the best of our knowledge. For proof of concept, we utilize our architecture in a framework for Fan-Vercauteren (FV) homomorphic encryption scheme, utilizing a hardware/software co-design approach, in which polynomial multiplication operations are offloaded to the accelerator via PCIe bus while the rest of operations in the FV scheme are executed in software running on an off-the-shelf desktop computer. Specifically, our framework is optimized to accelerate Simple Encrypted Arithmetic Library (SEAL), developed by the Cryptography Research Group at Microsoft Research, for the FV encryption scheme, where large degree polynomial multiplications are utilized extensively. The hardware part of the proposed framework targets Xilinx Virtex-7 FPGA device and the proposed framework achieves almost 11x latency speedup for the offloaded operations compared to their pure software implementations

    Accelerating LTV based homomorphic encryption in reconfigurable hardware

    Get PDF
    After being introduced in 2009, the first fully homomorphic encryption (FHE) scheme has created significant excitement in academia and industry. Despite rapid advances in the last 6 years, FHE schemes are still not ready for deployment due to an efficiency bottleneck. Here we introduce a custom hardware accelerator optimized for a class of reconfigurable logic to bring LTV based somewhat homomorphic encryption (SWHE) schemes one step closer to deployment in real-life applications. The accelerator we present is connected via a fast PCIe interface to a CPU platform to provide homomorphic evaluation services to any application that needs to support blinded computations. Specifically we introduce a number theoretical transform based multiplier architecture capable of efficiently handling very large polynomials. When synthesized for the Xilinx Virtex 7 family the presented architecture can compute the product of large polynomials in under 6.25 msec making it the fastest multiplier design of its kind currently available in the literature and is more than 102 times faster than a software implementation. Using this multiplier we can compute a relinearization operation in 526 msec. When used as an accelerator, for instance, to evaluate the AES block cipher, we estimate a per block homomorphic evaluation performance of 442 msec yielding performance gains of 28.5 and 17 times over similar CPU and GPU implementations, respectively

    Accelerating Fully Homomorphic Encryption in Hardware

    No full text
    corecore